Draft version September 3, 2012 

Preprint typeset using I4TgX style emulateapj v. 5/2/1 1 



MEASUREMENT OF 21 CM BRIGHTNESS FLUCTUATIONS AT z - 0.8 IN CROSS -CORRELATION 

K. W. MASUi''-, E. R. SWITZEr''\ N. BANAVAr\ K. BANDURA^ C. BLAKE^ L.-M. CALIn', T.-C. CHANg', X. CHEn'*'', Y.-C. Li\ 

Y.-W. LiAO^ A. Natarajan'", U.-L. Pen', J. B. Peterson'", J. R. Shaw', T. C. Voytek'" 

Draft version September 3, 2012 

ABSTRACT 

In this letter, 2 1 cm intensity maps acquired at the Green Bank Telescope are cross-correlated with large-scale 
structure traced by galaxies in the WiggleZ Dark Energy Survey. The data span the redshift range 0.6 < z < 1 
over two fields totaling ^ 41 deg. sq. and 190 hours of radio integration time. The cross-correlation constrains 
^mbmr = [0.43 ± 0.07(stat.) ± 0.04(sys.)] x 10^^, where Ohi is the neutral hydrogen (H l) fraction, r is the 
galaxy-hydrogen correlation coefficient, and feni is the H I bias parameter. This is the most precise constraint on 
neutral hydrogen density fluctuations in a challenging redshift range. Our measurement improves the previous 
21 cm cross-correlation at z ~ 0.8 both in its precision and in the range of scales probed. 
Subject headings: galaxies: evolution — large-scale structure of universe — radio lines: galaxies 



1. INTRODUCTION 

Measurements of neutral hydrogen are essential to our un- 
derstanding of the universe. Following cosmological reion- 
ization at z 6, the majority of hydrogen outside of galax- 
ies is ionized. Within galaxies, it must pass through its neu- 
tral phase (H l) as it cools and collapses to form stars. The 
quantity and distribution of neutral hydrogen is therefore inti- 
mately connected with the evolution of stars and galaxies, and 
observations of neutral hydrogen can give insight into these 
processes. 

Above redshift z = 2.2, the Lyman-a line redshifts 
into optical wavelengths and H I can be observed, typically 
in ab sorption against distant quasars (iProchaska and Wolfd 
l2009h . Below redshift z = 0.1, Hi has bee n studied us- 
ing 2 1 cm emission from its hyperfine splitting (IZwaan et al.\ 
120051: iMartin et a/.ll20ldl) . There, the abundance and large- 
scale distribution of neutral hydrogen are inferred from large 
catalogs of discrete galactic emitters. Between z — 0.1 and 
z = 2.2 there are fe wer constraints on neutral hydrogen , 
and those that do exist (iMeiring ef a/.ll201 It iLah et a/.ir2007t 
iRao et a/l l2006) have large uncertainties. 

While the 21cm line is too faint to observe individual 
galaxies in this redshift range, one ca n nonetheless pur- 



sue three-dimensional ii itensity mapping (IChang et al. 
iLoeb and Wy"Mi^l200l lAnsari et a/.l 120121 iMao et al. 



2008 



2008 



' Canadian Institute for Theoretical Astrophysics. University of 
Toronto, 60 St. George St.. Toronto, Ontario, M5S 3H8 

^ Department of Physics, University of Toronto, 60 St. George St., 
Toronto, Ontario, M5S 1A7, Canada 

^ Kavli Institute for Cosmological Physics, University of Chicago, 5640 
South Ellis Avenue, Chicago, IL 60637, USA 

* Department of Astronomy & Astrophysics, University of Toronto, 50 
St. George St., Toronto, Ontario, M5S 3H4, Canada 

^ Department of Physics, McGill University, 3600 Rue University, Mon- 
treal, Quebec, H3A 2T8, Canada 

' Centre for Astrophysics & Supercomputing, Swinburne University of 
Technology, P.O. Box 218, Hawthorn, VIC 3122, Australia 

' Academia Sinica Institute of Astronomy and Astrophysics, PO Box 
23-141, Taipei, 10617, Taiwan 

^ National Astronomical Observatories, Chinese Academy of Science, 
20A Datun Road, Beijing 100012, China 

' Center of High Energy Physics, Peking University, Beijing, 100871, 
China 

'" McWilliams Center for Cosmology, Carnegie Mellon University, De- 
partment of Physics, 5000 Forbes Ave., Pittsburgh PA 15213, USA 



ISeo et al.\ 120101: iMaol 12012b . Instead of cataloging many 
individual galaxies, one can study the large-scale structure 
(LSS) directly by detecting the aggregate emission from many 
galaxies that occupy large ^ 1000 Mpc'^ voxels. The use of 
such large voxels allows telescopes such as the Green Bank 
Telescope (GBT) to reach z ^ 1, conducting a rapid survey 
of a large volume. 

Synchrotron foregrounds are the primary challenge to this 
method because they are three orders of magnitude brighter 
than the 21cm signal. However, the physical process of 
synchrotr on emission is known to produce spec trally-smooth 
radiation jOh and Mackl 120031: ISeo ef a/.ll20Tol) . If the cal- 
ibration, spectral response and beam width of the instru- 
ment are well-controlled and characterized, the subtraction 
of foregrounds should be possible because the foregrounds 
have fewer degrees of freedom than the cosmological sig- 
nal. We find that this allows the foregrounds to be cleaned 
to the level of the expected signal. The auto-correlation of 
intensity maps is biased by residual foregrounds, and mini- 
mizing and constraining these residuals is an active area of 
work. However, because residual foregrounds should be un- 
corrected with the cosmological signal, they only boost the 
noise in a cross-correlation with existing surveys. This makes 
the cross-correlation a robust indication of neutral hydrogen 
density fluctuations in the 2 1 cm intensity maps (Chang et oZ] 
l2010l:IVuianovic et al. 

The first detection of the cross-correlation bet ween LSS and 
21 cm intensity maps at z 1 was reported in IChang et al.\ 
(1201 Oh . based on data from GBT and the DEEP2 galaxy sur- 
vey. Here we improve on these measurements by cross cor- 
relating new int ensity mapping data wit h the WiggleZ Dark 
Energy Survey (iDrinkwater et a/.ll2010l) . Our measurement 
improves on the statistical precision and range of scales of the 
previous result, which was based on 15 hrs of GBT integration 
time over 2 deg. sq. 

Throughout, we use cosmologic al param e ters from 
iKomatsu ef g/l (l2009i) . in accord with Blake et al.\ (1201 lb . 

2. OBSERVATIONS 

The observations presented here were conducted with the 
680-920 MHz prime-focus receiver at the Green Bank Tele- 
scope (GBT). The unblocked aperture of GBT's 100 m off- 
set paraboloid design results in well-controlled side-lobes and 
ground spill, advantageous to minimizing radio-frequency 
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contamination and overall system temperature 25 K). The 
receiver is sampled from 700 MHz (z = 1) to 900 MHz 
(z = 0.58) by the Green Bank Ultimate Pul sar Processing 
Instru ment (GUPPI) pulsar back-end systems (iDuPlain et al.\ 
l2008h . 

The data used in this analysis were collected between 
February and November 2011 as part of a 400 hr allocation 
over four fields. This allocation was specifical ly to corrob- 
orate previous cross-correlation measurements jChang et al.\ 
over a larger survey area, and to search for auto-power 
of diffuse 21 cm emission. The analysis here is based on a 
105 hr integration of a 4.5° x 2.4° "15 hr deep field" centered 
at 14''31'"28.5" right ascension, 2°0' declination and an 84 hr 
integration on a 7.0° x 4.3° "1 hr shallow" field centered at 
Qh52™0" right ascension, 2°9' decHnation. The beam FWHM 
at 700 MHz is 0.314° and at 900 MHz it is 0.25°. At band- 
center, the beam width corresponds to a comoving length of 
9.6 /i^^Mpc. Both fields have nearly complete angular over- 
lap and good redshift coverage with WiggleZ. 

Our observing strategy consists of sets of azimuthal scans 
at constant elevation to control ground spill. We start the set 
at the low right ascension (right hand) side of the field and 
allow the region to drift through. We then re-point the tele- 
scope to the right side of the field and repeat the process. For 
the 15 hr field, this set of scans consists of 8 one-minute scans 
each with a stroke of 4°. For the 1 hr field, a set of scans 
consists of 10 two-minute scans, each 8° in length. Note that 
since we observe over a range of local sidereal times, our scan 
directions cover a range of angles with respect to the sky. This 
range of crossing angles makes the noise more isotropic, and 
allows us to ignore the directional dependence of the noise in 
the 3D power spectrum. The survey regions have most cov- 
erage in the middle due to the largest number of intersect- 
ing scans. Observations were conducted at night to minimize 
radio-frequency interference (RFI). 

T he optical data are part o f the WiggleZ Dark Energy Sur- 
vey (iDrinkwater et a/.ll2010l) . a large-scale spectroscopic sur- 
vey of emission-line galaxies selected from UV and optical 
imaging. It spans redsh iftsO.2 < z < 1.0 across 1000 sq. deg. 
The selection function (IB lake et a/.ll2010l) has angular depen- 
dence determined primarily by the UV selection, and redshift 
coverage which favors the z = 0.6 end of the radio band. The 
galaxies are binned into volumes with the same pixelization 
as the radio maps and divided by the selection function, so 
that we consider the cross-power with respect to optical over- 
density. 

3. ANALYSIS 

Here we describe our analysis pipeline, which converts the 
raw data into 3D intensity maps, then correlates these maps 
with the WiggleZ galaxiesO 

3.1. From data to maps 

The first stage of our data analysis is a rough cut to miti- 
gate contamination by terrestrial sources of RFI. Our data na- 
tively have fine spectral resolution with 4096 channels across 
200 MHz of bandwidth. This facilitates the identification and 
flagging of RFI. In each scan, individual channels are flagged 
based on their variance. Any RFI not sufficiently prominent 
to be flagged in this stage is detected as increased noise later 

" Our analysis software is publicly available at 

https : //github . com/kiyo-masui/analysis_IM 



in the pipeline and subsequently down-weighted during map- 
making. Additional RFI is detected as frequency-frequency 
covariance in the foreground cleaning and subtracted in the 
map domain. While RFI is prominent in the raw data, after 
these steps, it was not found to be the primary limitation of 
our analysis. 

In addition to RFI, we also eliminate channels within 
6 MHz of the band edges (where aliasing is a concern) and 
channels in the 800 MHz receiver's two resonances at roughly 
798 MHz and 8 17 MHz. Before mapping, the data are re- 
binned to 0.78MHz-wide bands (corresponding to roughly 
3.8 /i^^Mpc at band-center). 

For a time-transfer calibration standard, we inject power 
from a noise diode into the antenna. The noise diode raises the 
system temperature by roughly 2 K and we switch it at 16 Hz 
so that the noise power can be cleanly isolated. Calibration 
is performed by first dividing by the noise diode power (av- 
eraged over a scan) in each channel, and then converting to 
flux using dedicated observations of 3C286 and 3C48. The 
gain for X and Y polarizations may differentially drift and so 
these are calibrated independently. Our absolute calibration 
uncertainty is dominated by the calibration of the reference 
flux scale (5%, jCellermann ef a/. (1969)), measurements of 
the calibration sources with r espect to this reference (5%, see 
also IScaife and Healdl ( 120121) ). and uncertainty of our mea- 
surement of these fluxes (5%). Receiver nonlinearity, uncer- 
tainty in the beam shape and variations in the diffuse galac- 
tic emission in the on- and off-source measurements are esti- 
mated to contribute of order 1% each. These are all assumed 
to be uncorrected errors and give 9% total calibration system- 
atic error. 

Gridding the data from the time ordered data to a map is 
done in two sta ges. We fo l low CM B map-making conventions 
as described in iT egm^^ ( 119971) . The map maker treats the 
noise to be uncorrected except for deweighting the mean and 
slope along the time axis for each scan. Each frequency chan- 
nel is treated independently. In the first round of map-making, 
the noise is estimated from the variance of the scan. This is 
inaccurate because the foregrounds dominate the noise. This 
yields a sub-optimal map which nonetheless has high a signal- 
to-noise ratio on the foregrounds. This map is used to esti- 
mate the expected foreground signal in the time ordered data 
and to subtract this expected signal, leaving time ordered data 
which are dominated by noise. After flagging anomalous data 
points at the Aa level, we re-estimate the noise and use this 
estimate for a second round of map-making, yielding a map 
which is much closer to optimal. In reality, it is a bad as- 
sumption that the noise is uncorrected. We have observed 
correlations at finite time lag and between separate frequency 
channels in our data. Exploiting these correlations to improve 
the optimality of our maps is an area of active research. For 
all map-making, we use square pixels with widths of 0.0627°, 
which corresponds to a quarter of the beam's FWHM at the 
high frequency edge of our band. Fig. [T]shows the 15 hr field 
map. 

In addition to the observed maps, we develop signal-only 
simulations based on Gaussian realizations of the non-linear, 
redshift-space power spectrum using the empirical-NL model 
described by Blake e t al. (2011). 

3.2. From maps to power spectra 

The approach to 21 cm foreground subtraction in literature 
has been dominated by the notion of fitting and subtracting 
smooth, orthogonal polynomials along each line of sight. This 
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Figure 1. Maps of the GBT 15 hr field at approximately the band-center. The puiple circle is the FWHM of the GBT beam, and the color range saturates in 
some places in each map. Left: The raw map as produced by the map-maker It is dominated by synchrotron emission from both extragalactic point sources 
and smoother emission from the galaxy. Right: The raw map with 20 foreground modes removed per line of sight relative to 256 spectral bins, as desciibed in 
Sec. 13. 21 The map edges have visibly higher noise or missing data due to the sparsity of scanning coverage. The cleaned map is dominated by thermal noise, and 
we have convolved by GBT's beam shape to bring out the noise on relevant scales. 



is motiva ted by the eigenvectors of smo oth synchrotron fore- 
grounds jLiu and Tegmarkll201 IL l2012h . In practice, instru- 
mental factors such as the spectral calibration (and its stabil- 
ity) and polarization response translate into foregrounds that 
have more complex structure. One way to quantify this struc- 
ture is to use the map itself to build the foreground model. 
To do this, we find the frequency-frequency covariance across 
the sample of angular pixels in the map, using a noise inverse 
weight. We then find the principal components along the fre- 
quency direction, order these by their singular value, and sub- 
tract a fixed number of modes of the largest covariance from 
each line of sight. Because the foregrounds dominate the real 
map, they also dominate the largest modes of the covariance. 

There is an optimum in the number of foreground modes to 
remove. For too few modes, the errors are large due to resid- 
ual foreground variance. For too many modes, 21 cm signal 
is lost, and so after compensating based on simulated signal 
loss (see below), the errors increase modestly. We find that 
removing 20 modes in both the 15 hr and 1 hr field maximizes 
the signal. Fig. [T] shows the foreground-cleaned 15hr field 
map. 

We estimate the cross-power spectrum using the inverse 
noise variance of the maps and the WiggleZ selection function 
as the weight for the radio and optical survey data, respec- 
tively. The variance is estimated in the mapping step and rep- 
resents noise and survey coverage. The foreground cleaning 
process also removes some 21 cm signal. We compensate for 
signal loss using a transfer function based on 300 simulations 
where we add signal simulations to the observed maps (which 
are dominated by foregrounds), clean the combination, and 
find the cross-power with the input simulation. Because the 
foreground subtraction is anisotropic in k±_ and we esti- 
mate and apply this transfer function in 2D. The GBT beam 
acts strictly in fcj^, and again we develop a 2D beam transfer 
function using signal simulations with the beam. 

The foreground filter is built from the real map variance, 
and so is slightly nonlinear in the signal. This has two primary 
consequences for the compensation. One is that the transfer 
function needs to be derived from realistic signal amplitudes. 
In practice, we find that the conclusions for the cross-power 
change negligibly under a halving of the assumed signal am- 
plitude, and that this nonlinearity is a secondary effect. The 
second consequence is that the cleaned foregrounds are anti- 
correlated with the signal because signal covariance also en- 



ters the cleaning mode functions. This is accounted for in our 
transfer function. Subtleties of the cleaning method will be 
described in a future methods paper 

We estimate the errors and their covariance in our cross- 
power spectrum by calculating the cross-power of the cleaned 
GBT maps with 100 r andom catalogs dr awn from the Wig- 
gleZ selection function ( iBlake et a/.l2010l) . The mean of these 
cross powers is consistent with zero, as expected. The vari- 
ance accounts for shot noise in the galaxy catalog and vari- 
ance in the radio map either from real signal (sample vari- 
ance), residual foregrounds or noise. Estimating the errors in 
this way requires many independent modes to enter each spec- 
tral cross-power bin. This fails at the lowest k values and so 
these scales are discarded. In going from the two-dimensional 
power to the ID powers presented here, we weight each 2D k- 
cell by the inverse variance of the 2D cross-power across the 
set of mock galaxy catalogs. The 2D to ID binning weight is 
multiplied by the square of the beam and foreground clean- 
ing transfer functions. Fig. |2] shows the resulting galaxy-Hl 
cross-power spectra. 
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Figure 2. Cross-power between the 15 hr and 1 hr GBT fields and WiggleZ. 
Negative points are shown with reversed sign and a thin line. T he solid line 
is the mean of simulations based on the empirical-NL model of IBlake et al\ 
1201 II) processed by the same pipeline. 
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4. RESULTS AND DISCUSSION 



To relate the measured spectra with theory, we start with the 
mean 21cm emission brightness temperature (IChang et al\ 
l2OT0h . 



n = 0.29 



n 



HI 



10- 



(1 



0.37 



1 + z 
1.8 



mK. 

(1) 

Here f^ni is the comoving H I density (in units of today's crit- 
ical density), and ftm and rt\ are evaluated at the present 
epoch. We observe the brightness contrast, ST = Tf,(5Hi, from 
fluctuations in the local H I over-density Jhi- On large scales, 
it is assumed that neutral hydrogen and optically-selected 
galaxies are biased tracers of the dark matter, so that Sm = 
bmS, and Jopt = b^ptS. In practice, both tracers may contain 
a stochastic component, so we include a galaxy-H I correla- 
tion coefficient r. This quantity is scale-dependent because 
of the /j-dependent ratio of shot noise to large-scale structure, 
but should approach unity on large scales. The cross-power 
spectrum is then given by Phi, opt (fc) = TbbmboptrPssik) 
where Pss{k) is the matter power spectrum. 

The large-scale matter power spectrum is well-known 
from cosmic micr owave background measurements 
dKomatsu et al.\ 1201 ll) and the bias of the optical galaxy 
population is measured to be b^^^^^_i_A8 ± 0.08 at the 
central redshift of our survey (iBlake et al.\ l20Tlh . Sim- 
ulations including nonlinear scales (as in Sec. 13.11 ) are 
run through the same pipeline as the data. We fit the 
unknown prefactor JIhi&hi?' of the theory to the measured 
cross-powers shown in Fig. |2] and determine f^ni^Hi'' = 
[0.44 ± 0.10(stat.) ± 0.04(sys.)] x lO'^ for the 15 hr field 
data, andOHi^Hir = [0.41 ±0.11(stat.) ±0.04(sys.)] x lO'^ 
for the 1 hr field data. The systematic term represents the 
9% absolute calibration uncertainty from Sec. 13.11 It 
does not include current uncertainties in the cosmological 
parameters or in the WiggleZ bias, but these are sub- 
dominant. Combining the two fields yields ilni&Hi'" ~ 
[0.43 ± 0.07(stat.) ± 0.04(sys.)] x lO^^. These fits are 
based on the range 0.075 /iMpc"^ < fc < 0.3/iMpc"^ 
over which we believe that errors are well-estimated 
(failing toward larger scales where there are too few k 
modes in the volume) and under the assumption that 
nonlinearities and the beam/pixelization (failing toward 
smaller scales) are well-understood. A less conservative 
approach is to fit for 0.05 ft-Mpc"^ < fc < 0.8/iMpc~^ 
where the beam, model of nonlinearity and error esti- 
mates are less robust, but which shows the full statistical 
power of the measurement, at 7.4tj combined. Here, 
i^mbmr = [0.40 ± 0.05(stat.) ± 0.04(sys.)] x 10"^ for the 
combined, flmbmr = [0.46 ± 0.08] x lO'^ for the 15 hr 
field and nmbmr ^ [0.34 ± 0.07] x 10"^ for th e 1 hr field. 

To compare to the result in lChang et a/.l(l2010h . flmbj-cir = 
[0.55 ± 0.15(stat.)] x 10~^, we must multiply their rela- 
tive bias (between the G BT intensity map and DEEP2) by 
the DEEP2 bias b = 1.2 dCoil et a/.ll2004l) to obtain an ex- 
pression with respect to feni- This becomes Ohi^hi?' = 
[0.66 ± 0.18(stat.)] x 10^^, and is consistent with our result. 

The absolute abundance and clustering of H I are of great 
interest in studies of galaxy and star formation. Our measure- 
ment is an integral constraint on the H I luminosity function, 
which can be directly compared to simulations. The quantity 
^mbm also determines the ampUtude of 21 cm temperature 



fluctuations. This is required for forecasts of the sensitivity of 
future 21 cm intensity mapping experiments. Since r < 1 we 
have put a lower limit on f^Hi&Hi- 

To determine 17hi alone from our cross-correlation requires 
external estimates of the H I bias and stochasticity. The lin- 
ear bia s of Hi is expected to be ^ 0.65 to ~ 1 at these red- 
shifts e Marm et a/1l2010t iKhandai et a/1l201 ll) . Simulations 
to interpre t IChang etaUi 201C I find values for r between 0.9 
and 0.95 (Kha ndai et a/.ll201ll) . albeit for a different optical 
galaxy population. Measurements of the correlation coef- 
ficient between WiggleZ galaxies and the total matter field 
are consistent with unity in this fc-range (with r,„.opt ^ 0.8) 
( iBlake et a/.ll201 lb . These suggest that our cross-correlation 
can be interpreted as I^hi between 0.45 x 10"'^ and 0.75 x 
10-3. 

Measurements with SDSS (iProchaska and Wolfd l2009h 
suggest that before z = 2, ilni may have already reached 
~ 0.4 X 10^'^. At low redshift, 21cm measurements giv e 
flmiz ~ 0) = (0.43 ± 0.03) x lO'^ (iMartin ef a/.ll2010h . 
Intermediate redshifts are more difficult to measure, and es- 
timates based on Mg-11 lines in DLA systems obs erved with 
HST find Qmiz - 1) « (0.97 ± 0.36) x lO'^ (iRao et al.\ 
HOO^, in rough agreement with z « 0.2 DLA measure- 
ments (iMeiring et al.\ l20TlT) and 21cm stacking (iLah et al.\ 
12007b . This is in some tension with a model where fini 
falls monotonically fro m the era of maximum star formation 
rate jDuffv et a/.ll2012l) . Under the assumption that 5hi = 
0.8, r = 1, the cross-correlation measurement here suggests 
JIhi ^ 0.5 X lO"'^, in better agreement, but clearly better mea- 
surements of 6hi and r are needed. Redshift space distortions 
can be exploited to break the degeneracy between JIhi and 
bias to measure these quantities i ndependently of simulations 
(IWvifliell2008HMasui et a/.ll2010l) . This wifl be the subject of 
future work. 
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